1. Introduction

The  improvement  of  microprocessor  speed  and  network  connectivity  has 
made  data  processing  on  microprocessor  workstations  more  efficient  and 
economical.  With  the  maturity  of  operating  systems  and  network  software, 
it  becomes  feasible  to  distribute  the  required  processing  to  individual 
workstations  or  personal  computers  that  are  networked  together  to 
achieve  concurrent  processing  and  reduce  turnaround  time.  This  report 
describes  a  distributed  computing  technique  (DCT)  employed  in  a  UNIX 
system  environment. 


2.  Environment 

First, let's examine the system architecture in question, which
accommodates distributed computing. Fifteen RISC (reduced instruction set
computer) workstations make up the hardware. They are configured and tuned
for engineering, scientific, and graphic applications. As depicted in
figure 1, all workstations are networked together by an Ethernet local area
network (LAN) operating at a data rate of 10 Mbit/s, with a gateway to the
Adelphi Local Area Digital Data Interchange Network (ALADDIN) at the
Army Research Laboratory (ARL) and to the Defense Research Engineering
Network (DREN). The workstations include three SGI Power IRIS 4D/440
systems (with four processors each), nine SGI Personal IRIS 4D/35 systems,
two IBM Power Station RS/6000-560 systems, and one IBM Power Station
RS/6000-530. This hardware makes up the main part of the system, called
the Electromagnetic Effects Modeling System (EeMS), which was designed
to provide fast engineering workstations at the desks of certain scientists
and engineers (S&Es) at ARL. All workstations run variations of the
UNIX operating system (SGI IRIX and IBM AIX) and use the TCP/IP
protocol for networking. (Both IRIX and AIX are derivatives of AT&T
UNIX and Berkeley UNIX.)

Figure 1. Electromagnetic Effects Modeling System (EeMS).
FDDI = fiber distributed data interface
SO = serial optical channel converter network

The total disk space in the EeMS is 50 gigabytes. Using the Network File
System (NFS) software and a special arrangement of file systems,
users can access their files from any system in the EeMS with the same file
path. For example, say a user (user1) has a data file on the "emsa1" system
with the file path /home/emsa1/user1/datafile: even when logged into
another system (say, "emsc5"), that user can access that data file on emsa1
with the same file path: /home/emsa1/user1/datafile.

Automount, a feature of NFS, is used to reduce network traffic. The
automount feature will mount remote file systems when they are needed.
When the file systems are not used for a specific time (specified by the
system administrator), automount will unmount the remote file systems.
With this arrangement, users' application programs can access all users'
files in the EeMS using the same file path from any system in the EeMS.

Network Information Service (NIS) software is employed to provide
every user a unique user identification (uid) on every workstation in the
EeMS. This special arrangement of unique uids for users of the EeMS has
alleviated problems associated with authentication and authorization of
reading, writing, and executing users' files and programs, while still
maintaining the necessary security for the operation of the EeMS.


3.  Technique 

By taking advantage of services provided by NFS and NIS software and
the special design and arrangement of file systems, the computer
environment of the EeMS enables and facilitates distributed processing
among all EeMS processors. A distributed computing technique was
implemented to concurrently process information on all workstations of the
EeMS. The technique requires the following components:

•  data  pool  file, 

•  index  file,  and 

•  getind  function  (for  providing  an  index  for  the  next  available  data  set). 

3.1  Data  Pool  File 

The data pool file is a collection of the input data needed by DCT processes
to compute a result. A process is a program submitted to a processor for
execution. The input data should be grouped into sets (or records) and
must be quantifiable. Each data set is the smallest amount of input data
needed to produce a result. The data pool file must be accessible to all DCT
processes, either through a local file system or a remote file system.

3.2  Index  File 

The index file contains the index number of the last used data set in the
data pool file, as well as flags to signal all DCT processes, or certain DCT
processes at certain hosts, to terminate. The index file must also be
accessible to all DCT processes, either as a local file or as a remote file
over the network. While the index file is being accessed, the calling
process must lock it by using a software lock mechanism, such as NFS lock, a
feature provided in the UNIX NFS. When the calling process locks the
index file, it has the exclusive right to write and modify the index file.
When another calling process tries to lock an index file that is already
locked, that process is put on hold and retries; if it still cannot obtain
the lock when a predefined time is up, it terminates.

An example of the contents of the index file is

       124     0
emsa1.arl.mil 3
emsc2.arl.mil 1

The first number on the first line is the index of the last used data set in
the data pool file. The second number on the same line is the
all-process-terminating flag, which signals all processes to terminate. A
value of 1 for the all-process-terminating flag means "terminate," and a
value of 0 means "proceed." Any lines after the first line are the
host-process-terminating flags, which have the format of a hostname followed
by a number of processes. A process executing on the host system whose
hostname matches a host-process-terminating flag will decrement by 1 the
number of processes on that line and terminate. The number of processes
can be greater than 1, because some host systems can execute more than
one process in the DCT. For example, the emsa1 system has four processors,
so it can launch four processes to execute four data sets at the same
time. The index file can be manually modified to set the terminating flags.
During manual modification, the index file should be locked.
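
The index file layout described above can be read with a few lines of C.
The following is only a sketch under stated assumptions, not the code used
in the EeMS: the names parse_index, index_info, host_flag, and the
MAX_HOSTS limit are hypothetical, and the index file is assumed to have
already been read into a character string.

```c
#include <stdio.h>
#include <string.h>

#define MAX_HOSTS 32   /* hypothetical upper bound on host lines */

struct host_flag {
    char name[64];     /* hostname as written in the index file */
    int  count;        /* number of processes asked to terminate */
};

struct index_info {
    long last_index;   /* index of the last used data set */
    int  stop_all;     /* all-process-terminating flag: 1 = terminate */
    int  nhosts;       /* number of host-process-terminating lines */
    struct host_flag hosts[MAX_HOSTS];
};

/* Parse the text of an index file. Returns 0 on success, or -1 if the
   first line does not hold two numbers. Blank lines are skipped because
   the scanf conversions used here skip leading white space. */
int parse_index(const char *text, struct index_info *out)
{
    const char *p = text;
    int consumed = 0;

    out->nhosts = 0;
    if (sscanf(p, "%ld %d%n", &out->last_index, &out->stop_all,
               &consumed) != 2)
        return -1;
    p += consumed;

    while (out->nhosts < MAX_HOSTS) {
        struct host_flag *h = &out->hosts[out->nhosts];
        if (sscanf(p, "%63s %d%n", h->name, &h->count, &consumed) != 2)
            break;                 /* no further "hostname count" pair */
        p += consumed;
        out->nhosts++;
    }
    return 0;
}
```

Keeping the file as plain text, as the report does, means it can also be
inspected and edited by hand when a terminating flag has to be set.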

3.3  getind  Function 

The purpose of the getind function is to provide the index of the next
available data set in the data pool file and also to interpret the
terminating flags set in the index file. The getind function will first try
to lock the index file. Each time it fails to lock the index file, it will
wait for a preset period and then try to lock again. If the number of tries
exceeds the preset number, the getind function will quit and signal the
process to terminate. After successfully locking the index file, the getind
function checks the last used index for end-of-record and checks the
terminating flags. If all the terminating conditions are negative or not
applicable to the calling process, then the getind function will increase
the index value to the next available index value, update the index file,
and return the next available index value to the calling process. Before
returning to the calling process, the getind function unlocks the index
file.

An example of the input parameters of the getind function, listed in
appendix A, is as follows:

•  the maximum number of input data sets,

•  the index file name, and

•  the hostname of the calling process.

Written in the C language, this getind function uses the "include" files
unistd.h and fcntl.h. The getind function will try to lock the index file up
to 25 times. Each time it fails, it will sleep for 5 s or less and then
retry (the number of attempts can be varied depending on the user's
requirements). If the hostname of the calling process appears in any of the
host-process-terminating flags, then the getind function will decrement by
1 the number of processes and return a flag value of -2, to inform the
calling process to terminate. After this decrement, if the number of
processes is 0, then the getind function will remove this line. Otherwise,
the getind function will return the index of the next available data set, a
value of 0 (if the index number has reached the maximum index number), or a
value of -1 (if the all-process-terminating flag value is 1).
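
The behavior just described can be sketched in C as follows. This is an
illustration written for this section, not the getind function of the
report. The name getind_sketch, the buffer sizes, MAX_HOSTS, and the fixed
5-s wait are assumptions, as is the order in which the conditions are
tested (all-process flag first, then the host flags, then end-of-record).
The return value -3 for a file that cannot be opened or locked follows the
messages printed by the application program in appendix A.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_TRIES 25   /* lock attempts, as stated in the text */
#define WAIT_SECS 5    /* sleep between attempts (assumed fixed here) */
#define MAX_HOSTS 32   /* hypothetical limit on host lines */

/* Returns the next index (> 0), 0 at end of the data pool, -1 for the
   all-process flag, -2 for a host flag, -3 if the file is unavailable. */
int getind_sketch(int nds, const char *fname, const char *host)
{
    struct flock lk;
    char buf[4096], out[4096], names[MAX_HOSTS][64];
    char *p = buf;
    int counts[MAX_HOSTS];
    int fd, i, n = 0, tries, used = 0, stop_all = 0, result, outlen;
    long last = 0;
    ssize_t len;

    if ((fd = open(fname, O_RDWR)) < 0)
        return -3;

    memset(&lk, 0, sizeof lk);
    lk.l_type = F_WRLCK;           /* exclusive lock on the whole file */
    lk.l_whence = SEEK_SET;
    for (tries = 1; fcntl(fd, F_SETLK, &lk) == -1; ++tries) {
        if (tries >= MAX_TRIES) { close(fd); return -3; }
        sleep(WAIT_SECS);
    }

    len = read(fd, buf, sizeof buf - 1);
    if (len < 0) { close(fd); return -3; }
    buf[len] = '\0';
    if (sscanf(p, "%ld %d%n", &last, &stop_all, &used) != 2) {
        close(fd);
        return -3;
    }
    p += used;
    while (n < MAX_HOSTS &&
           sscanf(p, "%63s %d%n", names[n], &counts[n], &used) == 2) {
        p += used;
        ++n;
    }

    if (stop_all == 1) {
        result = -1;
    } else {
        result = (last >= nds) ? 0 : (int)last + 1;
        for (i = 0; i < n; ++i) {
            if (strcmp(names[i], host) == 0 && counts[i] > 0) {
                --counts[i];       /* one process on this host stops */
                result = -2;
                break;
            }
        }
        if (result > 0)
            last = result;         /* claim the next data set */
    }

    /* Rewrite the file while the lock is held; lines whose count has
       dropped to 0 are left out. */
    outlen = snprintf(out, sizeof out, "%10ld %5d\n", last, stop_all);
    for (i = 0; i < n; ++i)
        if (counts[i] > 0)
            outlen += snprintf(out + outlen, sizeof out - outlen,
                               "%s %d\n", names[i], counts[i]);
    if (lseek(fd, 0, SEEK_SET) < 0 || ftruncate(fd, 0) != 0 ||
        write(fd, out, (size_t)outlen) != (ssize_t)outlen)
        result = -3;

    close(fd);                     /* closing releases the fcntl lock */
    return result;
}
```

Whether an fcntl lock is honored across hosts depends on the NFS lock
service being available on both client and server, so the sketch should be
tried on the target systems before it is relied on.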


4.  Algorithm 


For implementation, this distributed computing technique requires the
following arrangement:

•  The index file and data pool file must be accessible to the DCT processes.
The application program should be able to determine the number of data
sets contained in the data pool file.

•  There must be a software lock mechanism, such as NFS lock, to inform or
prevent other processes from modifying the index file while a process is
using it. The file lock mechanism should work across the network and also
in a heterogeneous system environment, consisting of systems from
different vendors.

•  The application program should be structured in such a way that each run
of a data set yields a result. The execution of one data set should be
independent of the execution of any other data set in the data pool.

In order to obtain an input data set for execution, the application program
will invoke the getind function to obtain the index of the next available
data set in the data pool file. To use the DCT, the user will start or submit
the application program on all available systems.
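
The control flow of one DCT process can be sketched as below. This is an
illustration only: getind_stub hands out indexes from a counter in memory
and stands in for the real getind function with its locked index file, and
process_data_set is a hypothetical placeholder for the application's
computation.

```c
#include <stdio.h>

/* Hypothetical stand-in for getind: hands out 1..nds from a counter kept
   in memory instead of a locked, shared index file. */
static int next_demo = 0;

int getind_stub(int nds)
{
    return (next_demo >= nds) ? 0 : ++next_demo;
}

/* Hypothetical per-data-set work; here it simply squares the input. */
double process_data_set(double input)
{
    return input * input;
}

/* Process data sets until the index source returns a value <= 0.
   Returns the number of data sets handled by this process. */
int run_worker(const double *pool, int nds, double *results)
{
    int index, nrun = 0;

    while ((index = getind_stub(nds)) > 0) {
        /* indexes start at 1, arrays at 0 */
        results[index - 1] = process_data_set(pool[index - 1]);
        ++nrun;
    }
    return nrun;
}
```

Because each process asks for a new index only when it has finished the
previous data set, faster processors automatically take more data sets
than slower ones.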
5. Performance, Problems, and Discussion


5.1  Performance 

In an effort to speed up turnaround time, the technique described here was
incorporated into an application program (listed in app A) used to
compute the electric and magnetic field at a point in space radiated by an
electromagnetic pulse (EMP) simulator.* In this case, a data set is the
coordinates of the point in space. The DCT used all processors in the EeMS:
12 processors in three SGI 4D/440 systems, 9 processors in nine SGI 4D/35
systems, 2 processors in two IBM RS/6000-560 systems, and 1 processor in
an IBM RS/6000-530 system. All parts of this program were written in the C
programming language.

Table 1 summarizes and allows comparison of the performance for this
specific EM application using the DCT. Column 2 gives the LINPACK
benchmark for the normal mode of system operation. The Mflops
benchmark is based on the LINPACK code with a 200x200 array. The
LINPACK code was linpack.c, retrieved from the netlib.att.com machine
on the Internet. The LINPACK benchmark in the C programming language
was used, since the EM application was written in the C language (for
more details, see app B). The execution time required to calculate the EM
fields for one data set is shown in column 3. The average execution time
for one run is based on the average execution time of three typical
observation points. All the execution times referred to here are based on
the normal operation of systems in the EeMS and assume that few or no other
users' processes were using the systems besides these distributed computing
processes. See appendix C for more details. Column 4 shows the
application benchmark times normalized to the performance of the SGI 4D/35
system, the slowest system in the EeMS. The estimated execution time
required to calculate the EM fields at 1000 observation points is shown in
the last column. For this particular application, use of the DCT provides a
reduction in execution time of almost a factor of 10 compared to a single
IBM RS/560.


Table 1. Summary of benchmarks and comparisons.

                  LINPACK     Average execution  Comparison    Estimated execution
                  benchmark   time for 1 data    to SGI 4D/35  time for 1000 data
System            (Mflops)    set (hh:mm:ss)     system        sets (days)
----------------  ----------  -----------------  ------------  -------------------
SGI 4D/35           3.88      10:39:25            1.00000      445
SGI 4D/440          4.33       9:44:28            1.09402      406
IBM RS/530         11.69       7:39:45            1.39079      320
IBM RS/560         22.80       3:47:52            2.80610      159
15 systems         N/A        N/A                29.13120       16
with DCT
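
The arithmetic behind columns 4 and 5 of table 1 can be checked with a few
lines of C. The sketch below is an illustration, not part of the original
programs; the function names are hypothetical, and rounding the estimated
days up to a whole day is an assumption inferred from the table values.

```c
#include <stdio.h>

/* Convert hours, minutes, and seconds to seconds. */
long hms_to_seconds(int h, int m, int s)
{
    return h * 3600L + m * 60L + s;
}

/* Performance relative to the reference system (column 4): the ratio of
   the reference execution time to this system's execution time. */
double relative_speed(long ref_seconds, long seconds)
{
    return (double)ref_seconds / (double)seconds;
}

/* Whole days needed for nsets data sets, rounded up (column 5).
   speedup is 1.0 for a single processor, or the sum of the relative
   speeds of all participating processors for the DCT. */
long days_for_sets(long seconds_per_set, long nsets, double speedup)
{
    double total = (double)seconds_per_set * (double)nsets / speedup;
    long days = (long)(total / 86400.0);

    if ((double)days * 86400.0 < total)
        ++days;                    /* round a partial day up */
    return days;
}
```

For example, the SGI 4D/35 time of 10:39:25 is 38,365 s per data set, so
1000 data sets need about 444.04 days, which the table reports as 445.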

* Brian B. Luu and Calvin D. Le, AESOP Field Prediction, Army Research
Laboratory, ARL-TR-835 (November 1995).
5.2 Problems

In a heterogeneous system environment, data format incompatibility in the
index and data pool files can prevent systems from different vendors from
correctly reading and processing the data sets. For example, a
DECstation 5000 can read the index file (since the index file is in ASCII),
but it cannot correctly read the data in the data pool file, because the data
pool file is in the binary floating-point format used by the SGI and IBM
systems in the EeMS. A common data format (e.g., ASCII) for all systems
should be used for the data pool file, the index file, and also the result
data file. Otherwise, special input/output functions must be written for an
application program to handle incompatible data formats.
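
One way to store floating-point values in a common ASCII format is
sketched below. It is an illustration only, not code from the report: the
function names are hypothetical, and the choice of 17 significant digits
assumes that the systems use IEEE 754 double precision, for which 17
digits are enough to recover the same value when the text is read back.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write a double as text with 17 significant digits. Returns the number
   of characters that the full text requires, as snprintf does. */
int double_to_ascii(double value, char *buf, size_t size)
{
    return snprintf(buf, size, "%.17g", value);
}

/* Read a double back from its text form. */
double ascii_to_double(const char *text)
{
    return strtod(text, NULL);
}
```

The cost of this portability is larger files and slower input and output
than with a raw binary format.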

During the course of execution, the systems that provide file server service
for the index and data pool file must be operating at all times to provide
the indexes and data sets for processes to execute. A power failure will
disrupt the distributed computing process, especially a power failure to
systems that provide the NFS services for the index and data pool file. The
terminating flags in the index file can be used to properly terminate the
processes if a power shutdown is expected or planned. If a power failure
improperly terminates the distributed computing processes, the
information in the index file can be used to restart the DCT processes from
the data sets that have not yet been processed. Indexing of input data along
with the DCT has saved the execution time that would be required to rerun
data sets already completed. Using uninterruptible power supplies (UPSs)
can mitigate unexpected power failures for systems. Note that the UPS for
systems that provide the network services (such as the bridge, router, and
NFS file servers for the index and data pool file) should outlast the UPSs
for other systems.

5.3  Discussion 

If  all  the  results  are  not  urgently  needed,  a  lower  priority  can  be  set  for  the 
DCT  processes  so  that  the  impact  on  other  users'  jobs  is  minimized.  With  a 
low  execution  priority  status,  the  DCT  processes  will  be  put  in  a  waiting 
queue  or  use  a  low  proportion  of  CPU  cycles  compared  to  other  users' 
processes  with  normal  priority.  But  in  the  evening,  when  there  are  few  or 
no  other  users'  jobs  running,  the  DCT  processes  will  use  all  the  available 
CPU  cycles. 
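
A DCT process can lower its own priority at startup, as sketched below.
The report does not say how the priority was set on the EeMS, so this is an
assumption: the sketch uses the POSIX setpriority call, and the function
name and the nice value of 19 (the weakest priority on Linux and many
other UNIX systems) are chosen only for illustration.

```c
#include <sys/resource.h>

/* Nice value assumed for background DCT work. */
#define DCT_NICE 19

/* Lower the scheduling priority of the calling process.
   Returns 0 on success and -1 on failure, as setpriority does. */
int lower_dct_priority(void)
{
    return setpriority(PRIO_PROCESS, 0, DCT_NICE);
}
```

The same effect can usually be obtained without changing the program by
starting it through the nice command.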

With the termination controls in the index file, the DCT offers flexibility for
participating systems. Not all 15 systems of the EeMS have to be active at
all times for the DCT. Some systems can be released from the DCT and
reactivated at a later time. More systems participating in the DCT will
result in a shorter completion time for all the input data sets.

To alter the course of the DCT (e.g., terminating some or all distributed
computing processes) without incurring improper termination, a lead time
is needed. An adequate lead time is an amount of time that is equal
to or greater than the amount of execution time for one data set on the
slowest system. But this required lead time can be longer, depending on
the CPU load of the slowest system at that time.

Many factors contribute to an uncertainty in the prediction of the
completion time of the DCT process. The uncontrollable factors are CPU load,
network speed, network traffic, and unexpected events (e.g., power failure,
system failure). But the amount of execution time for one data set on the
slowest system is the determining factor in this uncertainty.

For maximum benefit from the DCT, the number of data sets involved should
be greater than the sum, over the participating processors, of the
performance normalized to that of the slowest processor (as illustrated in
column 4 of table 1). Obviously, the minimum number of data sets should
also be greater than the number of processors. For example, for the maximum
benefit of using the DCT on all systems in the EeMS, the minimum number of
data sets in this application should be 29. But for the DCT performed on one
processor of an SGI 4D/35, one processor of an SGI 4D/440, one processor of
an IBM RS/530, and one processor of an IBM RS/560, the minimum number of
data sets should be 6.
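
The sums quoted above can be reproduced with the short sketch below. It is
an illustration only; the structure and function names are hypothetical,
and the relative speeds are the column 4 values of table 1.

```c
#include <stddef.h>

/* A group of identical processors and their speed relative to the
   slowest processor (column 4 of table 1). */
struct group {
    int    processors;
    double relative_speed;
};

/* Sum of the normalized performance over all participating processors. */
double aggregate_speed(const struct group *g, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; ++i)
        sum += g[i].processors * g[i].relative_speed;
    return sum;
}
```

With 9 processors at 1.00000, 12 at 1.09402, 1 at 1.39079, and 2 at
2.80610, the sum is about 29.13; with one processor of each type it is
about 6.29. These are the values behind the figures 29 and 6 in the text.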


6.  Conclusion 

As implemented and tested, this distributed computing technique has
demonstrated its utility for applications in which a large task can be
divided into many small tasks, and each small task executed independently
on any available system on the network. Usually, these large applications,
as in electromagnetics or acoustics, require supercomputer capability,
which has very limited flexibility. The distributed computing technique is
suitable for a system environment consisting of microprocessor
workstations networked together. This type of system environment is now
emerging in the industry: for example, networking of personal computer (PC)
Pentium workstations that are equipped with high-level operating
systems and network software. This environment is inexpensive and flexible,
requires less system administration, and eliminates the chance of failure
at a single focal point of operation, which could happen in a centralized
system environment like a mainframe or multiprocessor supercomputer. The
fundamental requirements to implement the distributed computing technique
are a network of systems and file server capability.
Appendix A. Application Program Listing

/* program clob.c
   date: 24 Jan 1995
   Author: Brian B. Luu

   Description: This C programming code will read in XYZ coordinates of points
   in space and compute the estimated electric and magnetic field radiated by AESOP
   (Army Electromagnetic Pulse Simulator Operation) at these locations.

   The coefficients of the current distribution on each dipole segment
   are precomputed and saved in a file which is hardcoded in the program with the
   file name "/home/emsc3/bluu/aesop/CUCOEF.DATA".

   All the input and output formats of the program are in SGI-IRIX or
   IBM-AIX binary-floating-point format except the "data pool" file which is in ASCII.

*/

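#include <stdio.h>
#include <stdlib.h>     /* exit, system */
#include <string.h>     /* strcpy, strcat, strlen, memmove */
#include <unistd.h>     /* gethostname */
#include <math.h>

#define MAXT 2000
#define MAXEH 16008
#define MAXS 14978
#define MAXSH 10040
#define MAXCO 299580
#define MAXS2 29956

double cue, cup;
double co[MAXCO];
double tcrs, sn0, sn1, rn0, rn1;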

int main(int argc, char *argv[])   /* Main program */
{
    /* Initialized data */
    char spc;
    char dscuco[] = "/home/emsc3/bluu/aesop/CUCOEF.DATA";
    char findex[100], hn[64];
    char dseht[160] = "mkdir ";
    struct { double hdd[24]; int hdi[8]; } hdr;
    int i, ib;
    int it, its;
    int is, isb, isl, isgx, isgz;
    int isg[4];
    int *itm = hdr.hdi;
    int index, nds, nrun = 0;
    double c = 3.e8;
    double dt = 1.e-9;
    double dlmin = 1.e-2;
    double al = 20.;
    double oxt = 100.4, ozt = al;
    double sax = 47.5, saz = 13.5;
    double cucc = 2.73448e-13;
    double cul = 4.07e-9;
    double cur = .005364;
    double dl, zoo4pi, pi;
    double cosa, sina, sa;
    double eh[MAXEH];
    double rs[4], rl[4], ros[4], rot[4];
    double tr, tehs, retard, etc, htc, cu, cud, cvo, tcs;
    double *ro = &hdr.hdd[1];
    double *ehm = &hdr.hdd[16];
    double air, rlp2, rlp3, xor, xorp2;
    register double *dp;
    FILE *fp, *fpd;
    size_t sizeof_double = sizeof(double);
    size_t sizeof_eh = sizeof(eh);
    size_t sizeof_hdr = sizeof(hdr);

    extern void exit(int);
    extern int current(int, double, double *, double *, double *);
    extern int getind(int, char *, char *);

    cup = sqrt(cul * cucc);
    cue = cur / (sqrt(cul / cucc) * 2.);

    /* INITIALIZE PARAMETERS */
    pi = atan(1.) * 4.;
    zoo4pi = c * 1e-7;
    rs[1] = 0.;
    sa = sqrt(sax * sax + saz * saz);
    cosa = sax / sa;
    sina = saz / sa;

    /* OBTAIN X&Z SWITCH, DATAPOOL FILENAME AND INDEX FILENAME */
    if (argc < 4)
    {
        printf("Not enough data: no switch number or separator character or "
               "datapool filename.\n");
        exit(3);
    }
    else
    {
        switch (argv[1][0])        /* x switch */
        {
        case '1':
            {
                isg[0] = 1;
                isg[1] = 1;
            }
            break;
        case '2':
            {
                isg[0] = -1;
                isg[1] = -1;
            }
            break;
        case '3':
            {
                isg[0] = 1;
                isg[1] = -1;
            }
            break;
        default:
            {
                printf("%c: invalid input for x switch; "
                       "must be 1, 2, or 3\n", argv[1][0]);
                exit(1);
            }
        }
        switch (argv[1][1])        /* z switch */
        {
        case '1':
            {
                isg[2] = 1;
                isg[3] = 1;
            }
            break;
        case '2':
            {
                isg[2] = -1;
                isg[3] = -1;
            }
            break;
        case '3':
            {
                isg[2] = 1;
                isg[3] = -1;
            }
            break;
        default:
            {
                printf("%c: invalid input for z switch; "
                       "must be 1, 2, or 3\n", argv[1][1]);
                exit(1);
            }
        }
        air = (argv[1][0] - '0')*10 + (argv[1][1] - '0');
        spc = argv[2][0];
        strcat(strcpy(findex, argv[3]), ".ind");

        /* create the index file if it cannot be opened after two tries */
        if ((fp = fopen(findex, "r")) == NULL)
        {
            if ((fp = fopen(findex, "r")) == NULL)
            {
                fp = fopen(findex, "w");
                fprintf(fp, "%10d %5d\n", 0, 0);
            }
        }
        fclose(fp);
    }


    /* READ IN CURRENT COEFFICIENTS OF ALL SEGMENTS */
    fp = fopen(dscuco, "r");
    for (i = 1, dp = co+20; i <= MAXS; ++i, dp += 20) {
        fscanf(fp, "%*d %le %le %le %le %le %le %le %le %le %le "
                   "%le %le %le %le %le %le %le %le %le %le",
               dp, dp+1, dp+2, dp+3, dp+4, dp+5, dp+6, dp+7, dp+8, dp+9,
               dp+10, dp+11, dp+12, dp+13, dp+14, dp+15, dp+16, dp+17, dp+18, dp+19);
    }
    fclose(fp);

    /* GET HOSTNAME */
    gethostname(hn, 64);

    /* READ PARTITION DATA SET NAME OF THE E AND H FIELD */
    if ((fpd = fopen(argv[3], "rb")) == NULL)
    {
        printf("Cannot open datapool file: %s; Program terminated.\n", argv[3]);
        exit(2);
    }
    fread((void *) (dseht+6), sizeof(char), 80, fpd);
    fseek(fpd, 0L, SEEK_END);
    nds = (ftell(fpd) - 80)/(13*sizeof_double);
    ib = (int) strlen(dseht);

    /* READ IN SEGMENT LENGTH AND COORDINATE OF OBSERVATION POINT */
    while ((index = getind(nds, findex, hn)) > 0)
    {
        fseek(fpd, 80 + (index-1)*13*sizeof_double, SEEK_SET);
        fread((void *) hdr.hdd, sizeof_double, 13, fpd);
        dl = hdr.hdd[0];
        isl = (dl + dlmin / 2.) / dlmin;
        printf("%10d: %g(%+20.13e, %+20.13e, %+20.13e)\n", index, dl, ro[0],
               ro[1], ro[2]);
        fflush(stdout);

        /* DETERMINE THE STARTING TIME */
        isb = (isl + 1) >> 1;
        isgx = 1;
        if (ro[0] < 0.) isgx = -1;
        rs[0] = isgx * ((isb << 1) - 1) * (dlmin / 2.);
        rlp2 = ro[0] - rs[0];
        rlp3 = ro[2] - al;
        tehs = sqrt(rlp2 * rlp2 + ro[1] * ro[1] + rlp3 * rlp3) / c + isb * cup;

        /* INITIALIZE EH ARRAY */
        for (dp = eh, i = 0; i < MAXEH; ++i) *dp++ = 0.;

        /* CALCULATE E AND H FIELD */
        for (is = isb; is <= MAXSH; is += isl) {
            tcs = is * cup;
            tcrs = ((MAXS - is << 1) + 1) * cup;
            sn0 = is - 1;
            sn1 = is;
            rn0 = MAXS2 - (is - 1);
            rn1 = MAXS2 - is;
            for (isgz = isg[2]; isgz >= isg[3]; isgz += -2) {
                for (isgx = isg[0]; isgx >= isg[1]; isgx += -2) {
                    rs[0] = isgx * ((is << 1) - 1) * (dlmin / 2.);
                    rs[2] = isgz * al;
                    rl[0] = ro[0] - rs[0];
                    rl[1] = ro[1] - rs[1];
                    rl[2] = ro[2] - rs[2];
                    rl[3] = sqrt(rl[0] * rl[0] + rl[1] * rl[1] + rl[2] * rl[2]);
                    retard = rl[3] / c + tcs - tehs;
                    its = retard / dt + 1.;
                    for (it = its; it <= MAXT; ++it) {
                        tr = it * dt - retard;
                        current(is, tr, &cu, &cud, &cvo);
                        rlp2 = rl[3] * rl[3];
                        rlp3 = rl[3] * rlp2;
                        xor = rl[0] / rl[3];
                        xorp2 = xor * xor;
                        dp = &eh[it*8];
                        etc = isgz * zoo4pi * dl * (cud / c / rl[3] + cu / rlp2 *
                              3. + c * cvo / rlp3 * 1.5);
                        *dp++ += isgz * zoo4pi * dl * (cud / c / rl[3] *
                                 (xorp2 - 1.) + (cu / rlp2 +
                                 cvo / 2. * c / rlp3) * (xorp2 * 3. - 1.));
                        *dp++ += rl[1] / rl[3] * xor * etc;
                        *dp++ += rl[2] / rl[3] * xor * etc;
                        htc = isgz * dl * (cud / c / rl[3] + cu / rlp2)/(pi * 4.);
                        ++dp;
                        *dp++ -= rl[2] / rl[3] * htc;
                        *dp += rl[1] / rl[3] * htc;
                    }
                }
            }
        }
        /* CALCULATE THE FIELD CONTRIBUTED BY THE SLANT PARTS */
        rs[2] = 0.;
        rot[1] = ro[1];
        for (; is <= MAXS; is += isl)
        {
            tcs = is * cup;
            tcrs = ((MAXS - is << 1) + 1) * cup;
            sn0 = is - 1;
            sn1 = is;
            rn0 = MAXS2 - (is - 1);
            rn1 = MAXS2 - is;
            for (isgz = isg[2]; isgz >= isg[3]; isgz += -2)
            {
                for (isgx = isg[0]; isgx >= isg[1]; isgx += -2)
                {
                    ros[0] = ro[0] - isgx * oxt;
                    ros[2] = ro[2] - isgz * ozt;
                    rot[0] = cosa * ros[0] - isgx * isgz * sina * ros[2];
                    rot[2] = isgx * isgz * sina * ros[0] + cosa * ros[2];
                    rs[0] = isgx * ((is - MAXSH << 1) - 1) * (dlmin / 2.);
                    rl[0] = rot[0] - rs[0];
                    rl[1] = rot[1] - rs[1];
                    rl[2] = rot[2] - rs[2];
                    rl[3] = sqrt(rl[0] * rl[0] + rl[1] * rl[1] + rl[2] * rl[2]);
                    retard = rl[3] / c + tcs - tehs;
                    its = retard / dt + 1.;
                    for (it = its; it <= MAXT; ++it)
                    {
                        tr = it * dt - retard;
                        current(is, tr, &cu, &cud, &cvo);
                        rlp2 = rl[3] * rl[3];
                        rlp3 = rl[3] * rlp2;
                        xor = rl[0] / rl[3];
                        xorp2 = xor * xor;
                        dp = &eh[it*8];
                        etc = isgz * zoo4pi * dl * (cud / c / rl[3] + cu / rlp2 *
                              3. + c * cvo / rlp3 * 1.5);
                        ehm[0] = isgz * zoo4pi * dl * (cud / c / rl[3] *
                                 (xorp2 - 1.) + (cu / rlp2 +
                                 cvo / 2. * c / rlp3) * (xorp2 * 3. - 1.));
                        ehm[2] = rl[2] / rl[3] * xor * etc;
                        *dp++ += cosa * ehm[0] + isgx * isgz * sina * ehm[2];
                        *dp++ += rl[1] / rl[3] * xor * etc;
                        *dp++ += -isgx * isgz * sina * ehm[0] + cosa * ehm[2];
                        htc = isgz * dl * (cud / c / rl[3] + cu / rlp2)/(pi * 4.);
                        ehm[5] = rl[1] / rl[3] * htc;
                        *dp++ += isgx * isgz * sina * ehm[5];
                        *dp++ -= rl[2] / rl[3] * htc;
                        *dp += cosa * ehm[5];
                    }
                }
            }
        }


        /* DETERMINE THE MAX VALUES OF E & H FIELDS AND THEIR TIMES */
        for (i = 0; i < 8; ++i)
        {
            ehm[i] = 0.;
            itm[i] = 0;
        }
        for (dp = eh, it = 0; it <= MAXT; ++it)
        {
            /* magnitudes of the E and H vectors at this time step */
            dp[6] = sqrt(dp[0] * dp[0] + dp[1] * dp[1] + dp[2] * dp[2]);
            dp[7] = sqrt(dp[3] * dp[3] + dp[4] * dp[4] + dp[5] * dp[5]);
            for (i = 0; i < 6; ++dp, ++i)
                if (fabs(*dp) > fabs(ehm[i]))
                {
                    ehm[i] = *dp;
                    itm[i] = it;
                }
            for (i = 6; i < 8; ++dp, ++i)
            {
                if (*dp > ehm[i])
                {
                    ehm[i] = *dp;
                    itm[i] = it;
                }
            }
        }

        /* SET A DATA SET NAME FOR E AND H FIELD */
        sprintf(dseht+ib, "%cz%+20.13e%cy%+20.13e%cx%+20.13e", spc, ro[2],
                spc, ro[1], spc, ro[0]);
        memmove((void *) hdr.hdd, (void *) &hdr.hdd[1], 6*sizeof_double);
        memmove((void *) &hdr.hdd[8], (void *) &hdr.hdd[7], 6*sizeof_double);
        hdr.hdd[6] = dl;
        hdr.hdd[7] = dt;
        hdr.hdd[14] = tehs;
        hdr.hdd[15] = air;

        /* WRITE E AND H FIELD TO FILE */
        if ((fp = fopen(dseht+6, "wb")) == NULL)
        {
            dseht[ib+44] = '\0';
            /* check if y directory is created or not */
            if ((fp = fopen(dseht+6, "r")) == NULL)
            {
                dseht[ib+22] = '\0';
                /* check if z directory is created or not */
                if ((fp = fopen(dseht+6, "r")) == NULL)
                {
                    dseht[ib] = '\0';
                    /* check if base directory is created or not */
                    if ((fp = fopen(dseht+6, "r")) == NULL)
                    {
                        if (system(dseht))   /* create base directory */
                        {
                            printf("Cannot create directory: %s; "
                                   "Program terminated.\n", dseht+6);
                            exit(1);
                        }
                    }
                    else
                        fclose(fp);   /* base directory is already created */
                    dseht[ib] = spc;
                    if (system(dseht))   /* create z directory */
                    {
                        printf("Cannot create directory: %s; "
                               "Program terminated.\n", dseht+6);
                        exit(1);
                    }
                }
                else
                    fclose(fp);   /* z directory is already created */
                dseht[ib+22] = spc;
                if (system(dseht))   /* create y directory */
                {
                    printf("Cannot create directory: %s; Program terminated.\n",
                           dseht+6);
                    exit(1);
                }
            }
            else
            {
                dseht[ib+44] = spc;
                printf("Cannot create file: %s; Program terminated.\n",
                       dseht+6);
                exit(1);
            }
            dseht[ib+44] = spc;
            fp = fopen(dseht+6, "wb");
        }
        fwrite((void *) &hdr, sizeof_hdr, 1, fp);
        fwrite((void *) eh, sizeof_eh, 1, fp);
        fclose(fp);
        ++nrun;
        printf("%10d> %s\n", nrun, dseht+6);
    }

fclose(fpd);
switch (index)
{
case 0:
    printf("Program is successfully completed.\n");
    break;
case -1:
    printf("All processes were instructed to terminate.\n");
    break;
case -2:
    printf("Program was instructed to terminate.\n");
    break;
case -3:
    printf("Cannot obtain index file: %s; Program terminated.\n", findex);
    break;
default:
    printf("Program abnormally terminated with index = %d\n", index);
}

} 
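
The file-writing step in the listing above creates missing output directories by handing an "mkdir ..." command string to system() and probing each level with fopen(). As a point of comparison only, and not the report's method, the same effect can be sketched with the POSIX mkdir() function called once per path level. The helper name make_dirs, the 1024-byte buffer, and the 0777 mode are assumptions made for illustration.

```c
#define _XOPEN_SOURCE 700   /* expose POSIX declarations under strict -std modes */
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create every directory level named in `path`.
 * Levels that already exist are skipped.
 * Returns 0 on success, -1 on failure with errno describing the cause. */
int make_dirs(const char *path)
{
    char buf[1024];
    size_t len = strlen(path);
    size_t i;

    if (len == 0 || len >= sizeof buf) {
        errno = (len == 0) ? EINVAL : ENAMETOOLONG;
        return -1;
    }
    memcpy(buf, path, len + 1);

    /* Start at 1 so the leading '/' of an absolute path is not cut off. */
    for (i = 1; i <= len; ++i) {
        if (buf[i] == '/' || buf[i] == '\0') {
            char saved = buf[i];
            buf[i] = '\0';                 /* truncate at this level */
            if (mkdir(buf, 0777) != 0 && errno != EEXIST)
                return -1;
            buf[i] = saved;                /* restore and continue */
        }
    }
    return 0;
}
```

Because EEXIST is ignored, calling make_dirs() twice on the same path succeeds both times, which suits several workstations writing into a shared directory tree. The sketch does not check that an existing final component is a directory rather than a regular file.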

/*
 * SUBROUTINE CALCULATE THE DIPOLE'S
 *   CURRENT,
 *   CURRENT DERIVATIVE,
 *   CONVOLUTION OF SIGN FUNCTION AND CURRENT
 */

int current(int ns, double ts, double *y, double *yd, double *yco)
{
    int ir;
    double ys, ysd, ysco, t;
    extern int curdis(int, int, double, double, double *, double *, double *);

    ir = 0;
    curdis(ns, ir, sn1, ts, &ys, &ysd, &ysco);
    t = ts + cup;
    curdis(ns, ir, sn0, t, y, yd, yco);
    *y -= ys;
    *yd -= ysd;
    *yco -= ysco;

    /* REFLECTION CURRENT IS NOT EMERGING */
    if ((t = ts - tcrs) <= 0.) return 0;

    /* REFLECTION CURRENT */
    ir = 12;
    curdis(ns, ir, rn0, t, &ys, &ysd, &ysco);
    *y += ys;
    *yd += ysd;
    *yco += ysco;
    t += cup;
    curdis(ns, ir, rn1, t, &ys, &ysd, &ysco);
    *y -= ys;
    *yd -= ysd;
    *yco -= ysco;
    return 0;
}  /* current */


/*
 * SUBROUTINE CALCULATES CURRENT PARAMETERS
 */

int curdis(int ns, int ir, double sn, double t,
           double *ys, double *ysd, double *ysco)
{
    register double *dp;
    double aa, ab, ac, ad, pd, af;
    double ea, eb, ec, ed, edp, ef;
    double ar, er;

    /* AA  = CO( 0+IR,NS) */
    /* AB  = CO( 1+IR,NS) */
    /* AC  = CO( 2+IR,NS) */
    /* AD  = CO( 3+IR,NS) */
    /* PD  = CO( 4+IR,NS) */
    /* AF  = CO( 5+IR,NS) */
    /* EA  = CO( 6,NS) */
    /* EB  = CO( 7,NS) */
    /* EC  = CO( 8,NS) */
    /* ED  = CO( 9,NS) */
    /* EDP = CO(10,NS) */
    /* EF  = CO(11,NS) */
    /* RA  = CO(12,NS) */
    /* RB  = CO(13,NS) */
    /* RC  = CO(14,NS) */
    /* RD  = CO(15,NS) */
    /* PDR = CO(16,NS) */
    /* RF  = CO(17,NS) */
    /* AR  = CO(18,NS) */
    /* ER  = CO(19,NS) */
    /* CURRENT DISTRIBUTION */

    dp  = &co[ns*20];
    aa  = *(dp +  0 + ir);
    ab  = *(dp +  1 + ir);
    ac  = *(dp +  2 + ir);
    ad  = *(dp +  3 + ir);
    pd  = *(dp +  4 + ir);
    af  = *(dp +  5 + ir);
    ea  = *(dp +  6);
    eb  = *(dp +  7);
    ec  = *(dp +  8);
    ed  = *(dp +  9);
    edp = *(dp + 10);
    ef  = *(dp + 11);
    ar  = *(dp + 18);
    er  = *(dp + 19);

    *ys = exp(- sn * cue) * ( aa * exp(ea * t) +
                              ab * exp(eb * t) +
                              ac * exp(ec * t) +
                              af * exp(ef * t) +
                              2. * ad * exp(ed * t) * cos(edp * t + pd) );

    /* DERIVATIVE OF CURRENT DISTRIBUTION */

    *ysd = exp(- sn * cue) * ( aa * exp(ea * t) * ea +
                               ab * exp(eb * t) * eb +
                               ac * exp(ec * t) * ec +
                               af * exp(ef * t) * ef +
                               2. * ad * exp(ed * t) *
               (ed * cos(edp * t + pd) - edp * sin(edp * t + pd)) );

    /* CONVOLUTION OF SGN(T) AND I(T) */

    *ysco = exp(- sn * cue) * ( aa * (2. * exp(ea * t) - 1.) / ea +
                                ab * (2. * exp(eb * t) - 1.) / eb +
                                ac * (2. * exp(ec * t) - 1.) / ec +
                                af * (2. * exp(ef * t) - 1.) / ef +
                2. * ad * (2. * exp(ed * t) *
                           (ed * cos(edp * t + pd) + edp * sin(edp * t + pd)) -
                           ed * cos(pd) - edp * sin(pd)) /
                (ed * ed + edp * edp) );

    if (ir == 0) return 0;

    *ys   += exp(- sn * cue) * ar * exp(er * t);
    *ysd  += exp(- sn * cue) * ar * exp(er * t) * er;
    *ysco += exp(- sn * cue) * ar * (2. * exp(er * t) - 1.) / er;
    return 0;
}  /* curdis */
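
Read off the listing, curdis evaluates three quantities: the current, its time derivative, and the convolution of the current with the sign function. They are written out below as a reading aid. The symbols are shorthand for the code variables (a_k for aa, ab, ac, af; e_k for ea, eb, ec, ef; a_d, e_d, e_dp, p_d for ad, ed, edp, pd), and the names g and theta are notation assumed here, not taken from the report. When ir is nonzero, the code adds one further exponential term with amplitude ar and exponent er to each quantity.

```latex
% Assumed shorthand: g = e^{-\mathrm{sn}\cdot\mathrm{cue}}, \quad \theta(t) = e_{dp}\,t + p_d
\begin{align*}
I(t) &= g\Bigl[\sum_{k\in\{a,b,c,f\}} a_k\,e^{e_k t}
        + 2a_d\,e^{e_d t}\cos\theta(t)\Bigr] \\
\frac{dI}{dt}(t) &= g\Bigl[\sum_{k\in\{a,b,c,f\}} a_k e_k\,e^{e_k t}
        + 2a_d\,e^{e_d t}\bigl(e_d\cos\theta(t) - e_{dp}\sin\theta(t)\bigr)\Bigr] \\
C(t) &= g\Bigl[\sum_{k\in\{a,b,c,f\}} \frac{a_k\,(2e^{e_k t}-1)}{e_k}
        + 2a_d\,\frac{2e^{e_d t}\bigl(e_d\cos\theta(t) + e_{dp}\sin\theta(t)\bigr)
          - e_d\cos p_d - e_{dp}\sin p_d}{e_d^{2} + e_{dp}^{2}}\Bigr]
\end{align*}
% Per-term check of C(t), assuming e < 0, t \ge 0, and a current that is zero for \tau < 0:
\[
\int_0^{\infty} \operatorname{sgn}(t-\tau)\,e^{e\tau}\,d\tau
  = \frac{e^{et}-1}{e} + \frac{e^{et}}{e}
  = \frac{2e^{et}-1}{e}
\]
```

The per-term integral matches the form (2 exp(e t) - 1) / e used in the code for each real exponential term. The damped-cosine term follows from the same integral with the complex exponent e_d + i e_dp, taking the real part.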
/*
 * FUNCTION getind
 * TO OBTAIN INDEX FOR DATA POOL
 */

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#ifdef _H_FCNTL
#include <sys/lockf.h>
#define O_RDWR 2
#endif

int getind(int nds, char *fn, char *hn)
{
    struct { char name[64]; int ext; } hst[25];
    int fd, i, j, index, sflag, try;
    FILE *fp;

    try = 0;
    fp = fopen(fn, "r+");
    fd = fp->_file;

    /* wait for exclusive access to the index file, giving up after 26 tries */
    while (lockf(fd, F_TLOCK, 0L) < 0)
    {
        if (++try > 26)
        {
            fclose(fp);
            return -3;
        }
        sleep((try > 5) ? 5 : try);
    }

    fscanf(fp, "%d %d", &index, &sflag);
    if (index >= nds)
        index = 0;
    else if (sflag)
        sflag = -1;
    else
    {
        i = 0;
        while (fscanf(fp, "%s %d", hst[i].name, &hst[i].ext) != EOF)
        {
            if (strcmp(hst[i].name, hn) == 0)
            {
                sflag = -2;
                /* drop this host entry once its count reaches zero */
                if (--hst[i].ext == 0) --i;
            }
            ++i;
        }
        if (sflag == 0)
            ++index;

        /* rewrite the index file with the updated index and host list */
        freopen(fn, "w+", fp);
        fprintf(fp, "%10d %5d\n", index, 0);
        for (j = 0; j < i; ++j)
            fprintf(fp, "%-64s %3d\n", hst[j].name, hst[j].ext);
    }

    fclose(fp);
    return sflag ? sflag : index;
}
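
The getind listing reads the descriptor through the stdio internal field fp->_file, which exists only on some systems. A minimal sketch of the same lock-and-retry step using the POSIX fileno() call is shown here. The function names lock_stream and unlock_stream and the parameters max_tries and max_delay are hypothetical, introduced only for illustration, and the sketch is not the report's code.

```c
#define _XOPEN_SOURCE 700   /* expose fileno(), lockf(), sleep() under strict -std modes */
#include <stdio.h>
#include <unistd.h>

/* Try to take an exclusive lock on the whole file behind `fp`.
 * After each failed attempt, sleep for the attempt number in seconds,
 * capped at `max_delay`. Returns 0 once locked, -1 if all attempts fail. */
int lock_stream(FILE *fp, int max_tries, unsigned max_delay)
{
    int fd = fileno(fp);   /* portable replacement for fp->_file */
    int attempt;

    if (fd < 0)
        return -1;
    for (attempt = 1; attempt <= max_tries; ++attempt) {
        if (lockf(fd, F_TLOCK, 0) == 0)
            return 0;      /* lock acquired */
        if (attempt < max_tries) {
            unsigned delay = (unsigned)attempt;
            sleep(delay > max_delay ? max_delay : delay);
        }
    }
    return -1;
}

/* Flush pending output, then release the lock taken by lock_stream(). */
int unlock_stream(FILE *fp)
{
    if (fflush(fp) != 0)
        return -1;
    return lockf(fileno(fp), F_ULOCK, 0);
}
```

With max_tries = 27 and max_delay = 5 the waiting pattern approximates the one in getind (1, 2, 3, 4, 5, 5, ... seconds). The lock is advisory, so it protects the index file only if every cooperating process takes the lock before reading or writing.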
Appendix B. LINPACK Benchmarking of Workstations

The performance of processors in the Electromagnetic Effects Modeling
System (EeMS) was benchmarked with the LINPACK benchmark.
LINPACK is an industry benchmark that measures the floating-point
performance of computer systems. The C-language version of the LINPACK
benchmark was used, since this electromagnetic application was
written entirely in C. The LINPACK code, linpack.c, was obtained
from the netlib.att.com machine on the Internet. The benchmark
used 15-digit double precision (8-byte representation) and 200 by 200 array
elements, which required 315 kbytes of system RAM (random access
memory). The LINPACK benchmark was submitted to processors in batch
mode during a period of light user activity. Tables B-1 to B-4 present the
results of LINPACK benchmarking on the four types of workstations
in the EeMS.

The tables also list the percentage of time spent in overhead and in the
two main routines of the benchmark program, DGEFA and DGESL, in which
the majority of floating-point operations are performed. The DGEFA
function factors a double-precision matrix by Gaussian elimination.
The DGESL function solves the double-precision system
(AX = B or A^T X = B) using the factors computed by DGEFA.
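
The Kflops column in the tables can be reproduced from the repetition count and the time spent in DGEFA plus DGESL. The sketch below is a reading of the tabulated numbers, not an excerpt from linpack.c: it assumes the usual LINPACK operation count of 2n^3/3 + 2n^2, a system order of n = 100 (half the 200-element array dimension), and a factor of 2 per repetition. These assumptions were chosen because they reproduce the tabulated values. The function names are hypothetical.

```c
/* Floating-point operations credited to one factor-and-solve of an
 * n-by-n system: 2n^3/3 + 2n^2 (standard LINPACK count). */
double linpack_ops(int n)
{
    double dn = (double)n;
    return 2.0 * dn * dn * dn / 3.0 + 2.0 * dn * dn;
}

/* Kflops for `reps` repetitions, given the seconds spent in DGEFA + DGESL.
 * Assumption: the leading factor 2 (two solves per repetition) is inferred
 * from the tables, not confirmed by the report. */
double linpack_kflops(int n, int reps, double solve_seconds)
{
    return 2.0 * reps * linpack_ops(n) / (1000.0 * solve_seconds);
}
```

For the first row of Table B-4 (8 repetitions, 0.58 s total, of which 82.76 percent, about 0.48 s, is DGEFA plus DGESL), linpack_kflops(100, 8, 0.48) gives about 22888.89, which agrees with the tabulated 22888.889. The first row of Table B-1 (2 repetitions, about 0.71 s in the two routines) gives about 3868.54.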


Table B-1. Average rolled and unrolled performance for an SGI 4D/35 system.

Reps  Time (s)  DGEFA (%)  DGESL (%)  Overhead (%)      Kflops
   2      0.79      86.08       3.80         10.13    3868.545
   4      1.58      87.34       3.80          8.86    3814.815
   8      3.13      88.18       2.56          9.27    3868.545
  16      6.31      87.16       3.01          9.83    3861.746
  32     12.58      87.04       3.02          9.94    3878.788

Table B-2. Average rolled and unrolled performance for an SGI 4D/440 system.

Reps  Time (s)  DGEFA (%)  DGESL (%)  Overhead (%)      Kflops
   2      0.71      85.92       2.82         11.27    4359.788
   4      1.40      87.86       2.14         10.00    4359.788
   8      2.83      87.28       2.83          9.89    4308.497
  16      5.64      87.23       3.01          9.75    4316.961
  32     11.29      86.71       3.28         10.01    4325.459
Table B-3. Average rolled and unrolled performance for an IBM RS/6000-530 system.

Reps  Time (s)  DGEFA (%)  DGESL (%)  Overhead (%)      Kflops
   4      0.57      84.21       0.00         15.79   11444.444
   8      1.15      75.65       7.83         16.52   11444.444
  16      2.34      81.62       1.71         16.67   11268.376
  32      4.61      82.21       1.74         16.05   11355.728
  64      9.20      76.85       1.20         21.96   12241.411
 128     18.50      78.49       2.81         18.70   11687.943

Table B-4. Average rolled and unrolled performance for an IBM RS/6000-560 system.

Reps  Time (s)  DGEFA (%)  DGESL (%)  Overhead (%)      Kflops
   8      0.58      82.76       0.00         17.24   22888.889
  16      1.15      82.61       0.87         16.52   22888.889
  32      2.30      75.65       1.74         22.61   24689.139
  64      4.61      80.48       3.47         16.05   22711.456
 128      9.21      82.41       1.41         16.18   22770.294
 256     18.40      79.08       4.73         16.20   22799.827
Appendix C. Application Benchmarking of Workstations

The electromagnetic (EM) application program named "dob" was benchmarked
on the four types of workstations in the Electromagnetic Effects
Modeling System (EeMS): SGI 4D/35, SGI 4D/440, IBM RS/6000-530,
and IBM RS/6000-560. On each system, the benchmark program used the
same input data sets, which were based on three typical observation
points. The benchmark tests were executed during periods in which few
or no other users' jobs were running, such as at night or on the weekend.
The total execution time on each system was computed in two ways: one
based on the time stamps on files and the other based on the UNIX "timex"
command. The average execution time for one data set was determined
from the total execution time of all three data sets. The two methods
produced average execution times that were very close. In the analysis of
the distributed computing technique (DCT), the average execution time
based on the UNIX "timex" command was used, with the fractions of
seconds discarded.
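
The averaging step described above is simple arithmetic on elapsed times. As a minimal sketch (the function names hms_to_seconds and mean_seconds are hypothetical, and the H:MM:SS input format is assumed from the way the times are tabulated), the per-data-set average can be computed as follows.

```c
#include <stdio.h>

/* Parse an elapsed time written as "H:MM:SS" into seconds.
 * Returns -1 if the string is malformed. */
long hms_to_seconds(const char *s)
{
    int h, m, sec;

    if (sscanf(s, "%d:%d:%d", &h, &m, &sec) != 3)
        return -1;
    if (h < 0 || m < 0 || m > 59 || sec < 0 || sec > 59)
        return -1;
    return 3600L * h + 60L * m + sec;
}

/* Mean of n elapsed times, in seconds. Returns -1.0 on bad input. */
double mean_seconds(const char *const times[], int n)
{
    long total = 0;
    int i;

    if (n <= 0)
        return -1.0;
    for (i = 0; i < n; ++i) {
        long t = hms_to_seconds(times[i]);
        if (t < 0)
            return -1.0;
        total += t;
    }
    return (double)total / n;
}
```

Applied to the three SGI 4D/35 entries of Table C-1 (10:06:32, 11:10:54, 10:40:50), the total is 115096 s and the mean is about 38365.33 s, that is 10:39:25.33, which is the file-time-stamp entry for that system in Table C-2.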


Table C-1. EM application benchmarking recorded using file time stamp.

                  Execution time of benchmark (h:min:s) on
Benchmark data    SGI          SGI          IBM            IBM
(data point)      4D/35        4D/440       RS/6000-530    RS/6000-560
(0,50,20)         10:06:32     9:15:02      7:17:39        3:36:55
(1000,0,300)      11:10:54     10:13:34     8:01:16        3:58:33
(100,100,100)     10:40:50     9:45:19      7:40:22        3:48:08


Table C-2. Average execution time for one data set.

                  Average execution time of benchmark (h:min:s) on
Computed          SGI          SGI          IBM            IBM
based on          4D/35        4D/440       RS/6000-530    RS/6000-560
file time stamp   10:39:25.33  9:44:38.33   7:39:45.67     3:47:52.00
timex command     10:39:25.78  9:44:28.89   7:39:45.47     3:47:52.13